3 research outputs found

Language-independent pre-processing of large document bases for text classification

Author: Wang Yanbo Justin

Text classification is a well-known topic in knowledge discovery in databases research. Algorithms for text classification generally involve two stages. The first identifies textual features (i.e. words and/or phrases) that may be relevant to the classification process. The second is concerned with classification rule mining and the categorisation of "unseen" textual data. The first stage is the subject of this thesis. It often involves an analysis of text that is language-specific (and possibly domain-specific) and that may also be computationally costly, especially when dealing with large datasets. Existing approaches to this stage are therefore not generally applicable to all languages. In this thesis, we examine a number of alternative keyword selection methods and phrase generation strategies, coupled with two potential significant word list construction mechanisms and two final significant word selection mechanisms, to identify words and/or phrases in a given textual dataset that are expected to distinguish between classes, using simple, language-independent statistical properties. We present experimental results, using common (large) textual datasets in two distinct languages, to show that the proposed approaches can achieve good performance with respect to both classification accuracy and processing efficiency. In other words, the study presented in this thesis demonstrates that the traditional text classification problem can be solved efficiently in a language-independent (and domain-independent) manner.

University of Liverpool Repository

Language-independent pre-processing of large document bases for text classification

Author: Wang Yanbo Justin
Publication date: 01/01/2008

EThOS - Electronic Theses Online Service (GB, United Kingdom)

OpenGrey Repository

ANALYSIS OF MESENCHYMAL STEM CELL DIFFERENTIATION IN VITRO

Author: Frans Coenen, James M., Quinlan J. R., René Bañares-Alcántara, Schaffer C., Weiqi Wang, Yanbo Justin Wang, Zhanfeng Cui
Publication venue: 'World Scientific Pub Co Pte Lt'

Crossref